A MEDLINE categorization algorithm
نویسندگان
چکیده
BACKGROUND Categorization is designed to enhance resource description by organizing content description so as to enable the reader to grasp quickly and easily what are the main topics discussed in it. The objective of this work is to propose a categorization algorithm to classify a set of scientific articles indexed with the MeSH thesaurus, and in particular those of the MEDLINE bibliographic database. In a large bibliographic database such as MEDLINE, finding materials of particular interest to a specialty group, or relevant to a particular audience, can be difficult. The categorization refines the retrieval of indexed material. In the CISMeF terminology, metaterms can be considered as super-concepts. They were primarily conceived to improve recall in the CISMeF quality-controlled health gateway. METHODS The MEDLINE categorization algorithm (MCA) is based on semantic links existing between MeSH terms and metaterms on the one hand and between MeSH subheadings and metaterms on the other hand. These links are used to automatically infer a list of metaterms from any MeSH term/subheading indexing. Medical librarians manually select the semantic links. RESULTS The MEDLINE categorization algorithm lists the medical specialties relevant to a MEDLINE file by decreasing order of their importance. The MEDLINE categorization algorithm is available on a Web site. It can run on any MEDLINE file in a batch mode. As an example, the top 3 medical specialties for the set of 60 articles published in BioMed Central Medical Informatics & Decision Making, which are currently indexed in MEDLINE are: information science, organization and administration and medical informatics. CONCLUSION We have presented a MEDLINE categorization algorithm in order to classify the medical specialties addressed in any MEDLINE file in the form of a ranked list of relevant specialties. The categorization method introduced in this paper is based on the manual indexing of resources with MeSH (terms/subheadings) pairs by NLM indexers. This algorithm may be used as a new bibliometric tool.
منابع مشابه
Chi-square-based scoring function for categorization of MEDLINE citations
OBJECTIVES Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood of MEDLINE citations containing genetic relevant topic. METHODS Our procedure requires construction of a genetic and a nongenetic domain document corpus. W...
متن کاملImproving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA
With the explosive growth in amount of information, it is highly required to utilize tools and methods in order to search, filter and manage resources. One of the major problems in text classification relates to the high dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of features space. There are many feature selection methods. However...
متن کاملAutomatic Text Categorization and Its Applicationto Text
We develop an automatic text categorization approach and investigate its application to text retrieval. The categorization approach is derived from a combination of a learning paradigm known as instancebased learning and an advanced document retrieval technique known as retrieval feedback. We demonstrate the e ectiveness of our categorization approach using two real-world document collections f...
متن کاملTransaction / Regular Paper Title
Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is...
متن کاملEnhancing Patent Expertise through Automatic Matching with Scientific Papers
This paper focuses on a subtask of the QUAERO research program, a major innovating research project related to the automatic processing of multimedia and multilingual content. The objective discussed in this article is to propose a new method for the classification of scientific papers, developed in the context of an international patents classification plan related to the same field. The pract...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- BMC Medical Informatics and Decision Making
دوره 6 شماره
صفحات -
تاریخ انتشار 2006